NSF PAR Search | NSF Public Access Repository

Time-Aware Knowledge Representations of Dynamic Objects with Multidimensional Persistence

Coskunuzer, Baris; Segovia-Dominguez, Ignacio; Chen, Yuzhou; Gel, Yulia (February 2024, Proceedings of 38th AAAI Conference on Artificial Intelligence)
Wooldridge, Michael (Ed.)
Learning time-evolving objects such as multivariate time series and dynamic networks requires the development of novel knowledge representation mechanisms and neural network architectures, which allow for capturing implicit time-dependent information contained in the data. Such information is typically not directly observed but plays a key role in the learning task performance. In turn, the lack of a time dimension in knowledge encoding mechanisms for time-dependent data leads to frequent model updates, poor learning performance, and, as a result, subpar decision-making. Here we propose a new approach to a time-aware knowledge representation mechanism that notably focuses on implicit time-dependent topological information along multiple geometric dimensions. In particular, we propose a new approach, named Temporal MultiPersistence (TMP), which produces multidimensional topological fingerprints of the data by using the existing single-parameter topological summaries. The main idea behind TMP is to merge the two newest directions in topological representation learning, that is, multi-persistence, which simultaneously describes data shape evolution along multiple key parameters, and zigzag persistence, which enables us to extract the most salient data shape information over time. We derive theoretical guarantees of TMP vectorizations and show its utility in application to forecasting on benchmark traffic flow, Ethereum blockchain, and electrocardiogram datasets, demonstrating competitive performance, especially in scenarios of limited data records. In addition, our TMP method improves the computational efficiency of the state-of-the-art multi-persistence summaries by up to 59.5 times.
Full Text Available
GLDL: Graph Label Distribution Learning

Jin, Yufei; Gao, Richard; He, Yi; Zhu, Xingquan (February 2024, Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI))
Wooldridge, Michael; Dy, Jennifer; Natarajan, Sriraam (Ed.)
Full Text Available
Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains

https://doi.org/10.1609/AAAI.V38I17.29933

Zhang, Yu; Zhang, Yunyi; Shen, Yanzhen; Deng, Yu; Popa, Lucian; Shwartz, Larisa; Zhai, ChengXiang; Han, Jiawei (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)
Wooldridge, Michael J; Dy, Jennifer G; Natarajan, Sriraam (Ed.)
Accurately typing entity mentions from text segments is a fundamental task for various natural language processing applications. Many previous approaches rely on massive human-annotated data to perform entity typing. Nevertheless, collecting such data in highly specialized science and engineering domains (e.g., software engineering and security) can be time-consuming and costly, not to mention the domain gaps between training and inference data if the model needs to be applied to confidential datasets. In this paper, we study the task of seed-guided fine-grained entity typing in science and engineering domains, which takes the name and a few seed entities for each entity type as the only supervision and aims to classify new entity mentions into both seen and unseen types (i.e., those without seed entities). To solve this problem, we propose SEType, which first enriches the weak supervision by finding more entities for each seen type from an unlabeled corpus using the contextualized representations of pre-trained language models. It then matches the enriched entities to unlabeled text to get pseudo-labeled samples and trains a textual entailment model that can make inferences for both seen and unseen types. Extensive experiments on two datasets covering four domains demonstrate the effectiveness of SEType in comparison with various baselines. Code and data are available at: https://github.com/yuzhimanhua/SEType.
Full Text Available
